Abstract

We give a thermodynamic interpretation of the moment map for
toric varieties. The convexity properties of this map correspond to
thermodynamical principles (concavity of the entropy functional)
applied to a system with several Hamiltonians.

Introduction. 

This elementary note is an exercise in generalizing the standard (Maxwell-
Boltzmann-Gibbs) approach to thermodynamics to the case when the energy
function is vector valued. This case is similar to that of several
commuting Hamiltonians, familiar in the theory of integrable systems.
It turns out that the mean energy function of such "higher-dimensional
thermodynamics" is basically the same as the moment map in the theory of
toric varieties. The low-temperature limit of the standard theory
generalizes to the "tropical limits" near vertices of the convex polytope
given by the image of the moment map. Convexity properties of the moment
map correspond to fundamental thermodynamic principles such as concavity
of the entropy functional.

Relations between thermodynamics and tropical geometry have recently
begun to attract some interest from various directions [3], [5]. In
particular, the observation that tropical geometry corresponds,
thermodynamically, to the low-temperature limit (in contrast with the name
"tropical", which suggests the opposite) has been made by I. Itenberg and
G. Mikhalkin in [3]. Consideration of a vector inverse temperature, as in
this note, makes this observation even clearer.

I am grateful to G. Mikhalkin for stimulating discussions.

1 Reminder on usual thermodynamics (one Hamiltonian).

We start by reviewing the standard material on statistical thermodynamics, 
see [H E], with some extra emphasis on logical structure. 

A. The Gibbs distribution. Consider a thermodynamical system such
as a gas, with the (large but finite) set of states $A$. According to the
fundamental principle of Boltzmann, the statistical behavior of the system
is completely determined by the knowledge of the energies of various
states, i.e., by the choice of a function $E : A \to \mathbb{R}$. To find
this behavior, one forms the Boltzmann partition function

(1.1)  $Z(\beta) = \sum_{\omega \in A} e^{-\beta E(\omega)}, \quad \beta = \frac{1}{kT},$

where $T$ is the absolute temperature and $k$ is the Boltzmann constant.
This is a finite sum of exponentials, so it is well-defined and positive
for all real values of $\beta$. The Gibbs probability distribution is
given by

(1.2)  $p_\omega(\beta) = \frac{e^{-\beta E(\omega)}}{Z(\beta)}, \quad \sum_{\omega \in A} p_\omega(\beta) = 1.$

The interpretation of $p_\omega(\beta)$ is as follows. Let us heat the
system to temperature $T = 1/k\beta$ and wait till it arrives at a
"thermodynamical equilibrium". Then $p_\omega(\beta)$ is the probability
that the system is in the state $\omega$.
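
As a small numerical illustration of the partition function and the Gibbs distribution just described, here is a minimal sketch. The three-state system and its energies are hypothetical values chosen only for the example, and the function names are not from the text.

```python
import math

def partition_function(energies, beta):
    """Z(beta) = sum over states of exp(-beta * E(omega))."""
    return sum(math.exp(-beta * e) for e in energies)

def gibbs(energies, beta):
    """Gibbs probabilities p_omega(beta) = exp(-beta * E(omega)) / Z(beta)."""
    z = partition_function(energies, beta)
    return [math.exp(-beta * e) / z for e in energies]

# Hypothetical toy system with three states (energies chosen for illustration).
energies = [0.0, 1.0, 3.0]
for beta in (0.0, 1.0, 10.0):
    print(beta, [round(p, 4) for p in gibbs(energies, beta)])
```

At beta = 0 all states are equally probable, and for large positive beta the probability concentrates on the state of lowest energy.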

By an observable we mean simply a function $O : A \to \mathbb{R}$. The
Gibbs distribution gives rise to the mean value of $O$, which is a function

(1.3)  $\langle O \rangle = \langle O \rangle(\beta) = \sum_{\omega \in A} p_\omega(\beta) \cdot O(\omega).$

In particular, we have the mean value of the energy

(1.4)  $\langle E \rangle(\beta) = \sum_{\omega \in A} p_\omega(\beta) E(\omega) = -\frac{d}{d\beta} \log Z(\beta).$

Let $E_{\min}$, $E_{\max}$ be the minimal and maximal values of $E$. For
simplicity assume that each of these values is attained at one state:
$\omega_{\min}$, $\omega_{\max}$. In the low temperature limit
$\beta \to +\infty$ the value $\langle E \rangle(\beta)$ approaches
$E_{\min}$, as $p_{\omega_{\min}}(\beta)$ approaches 1. The state
$\omega_{\min}$ usually has a clear physical meaning and is called the
ground state. Similarly, in the other limit^1 $\beta \to -\infty$ we
have that $\langle E \rangle(\beta) \to E_{\max}$. In fact, we have

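Proposition 1.5. The function $\langle E \rangle(\beta)$ defines a
monotone decreasing diffeomorphism from $\mathbb{R}$ to the open interval
$(E_{\min}, E_{\max})$.

Proof: We need only to show that $\langle E \rangle(\beta)$ is monotone
decreasing. But

$\frac{d\langle E \rangle}{d\beta} = \frac{1}{Z(\beta)^2} \sum_{\omega, \omega' \in A} \left( -E(\omega)^2 + E(\omega) E(\omega') \right) e^{-\beta (E(\omega) + E(\omega'))}.$

For an unordered pair $\{\omega \neq \omega'\}$ the two coefficients at
$e^{-\beta (E(\omega) + E(\omega'))}$ sum to
$-(E(\omega) - E(\omega'))^2 \leq 0$, while for $\omega = \omega'$ the
coefficient vanishes. So unless all the $E(\omega)$ are equal to each
other, the derivative is strictly negative. □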
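
The monotonicity in Proposition 1.5 can be checked numerically. The sketch below uses the same hypothetical three-state energies as an illustration and relies on the fact that the double sum in the proof equals minus the variance of the energy under the Gibbs distribution. The function names are illustrative only.

```python
import math

def mean_energy(energies, beta):
    """Mean energy under the Gibbs distribution at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z

def energy_variance(energies, beta):
    """Variance of the energy under the Gibbs distribution."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    m1 = sum(w * e for w, e in zip(weights, energies)) / z
    m2 = sum(w * e * e for w, e in zip(weights, energies)) / z
    return m2 - m1 * m1

energies = [0.0, 1.0, 3.0]  # hypothetical example values
betas = [-4.0 + 0.5 * k for k in range(17)]  # beta from -4 to 4
for beta in betas:
    print(beta, round(mean_energy(energies, beta), 5))
```

The printed values decrease from just below the maximal energy 3 to just above the minimal energy 0, and a finite-difference slope agrees with minus the variance.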

B. Derivation of the Gibbs distribution. For future convenience, we
sketch here the classical derivation of (1.2). Like many thermodynamical
arguments, it assumes two levels of microscopicity. That is, although the
set $A$ is already supposed to be very large, $|A| \gg 0$, and to involve
microscopic degrees of freedom, we now assume that we have a much larger
number $N \gg |A|$ of "truly microscopic" particles which can be
distributed among the states $\omega \in A$, possibly many at a time. Each
such way of distributing particles is called a microstate.^2 We further
assume (this corresponds to the classical and not quantum approach to the
problem) that the particles distributed are distinguishable from each
other. This means that we can think of microstates as being sequences
$\zeta = (\zeta_1, \ldots, \zeta_N)$ of elements of $A$, forming the
Cartesian power $A^N$.

Since we want to determine a probability distribution (measure) on $A$,
consider first the space $\Delta^A$ of all such measures. This is a
simplex of dimension $|A| - 1$. Fix an arbitrary
$p = (p_\omega)_{\omega \in A} \in \Delta^A$ and equip $A^N$ with the
product measure. For a microstate $\zeta \in A^N$ as above let
$N_\omega(\zeta) = |\{i : \zeta_i = \omega\}|$ be the number of particles
in the state $\omega$, so the observed probability of being in the state
$\omega$ (observed at $\zeta$) is $q_\omega(\zeta) = N_\omega(\zeta)/N$.
Fix a partition $N = \sum_{\omega \in A} n_\omega$,
$n_\omega \in \mathbb{Z}_+$. Then the set of $\zeta \in A^N$ such that
$N_\omega(\zeta) = n_\omega$ has the measure (probability)

(1.6)  $\frac{N!}{\prod_{\omega \in A} n_\omega!} \prod_{\omega \in A} p_\omega^{n_\omega},$

as the measure of any single such $\zeta$ is
$\prod_{\omega \in A} p_\omega^{n_\omega}$, while the number of these
$\zeta$ is the multinomial coefficient. Using the Stirling approximation

(1.7)  $\log(n!) \sim n(\log(n) - 1), \quad n \gg 0,$

we approximate the logarithm of (1.6) by

$N \sum_{\omega \in A} q_\omega \left( \log(p_\omega) - \log(q_\omega) \right), \quad q_\omega = n_\omega / N.$

^1 This is not the high temperature limit but rather the non-physical
limit of $T$ approaching 0 from the negative direction. The limit
$T \to +\infty$ corresponds to $\beta \to 0$, when all the states become
equally probable.

^2 The "thermostat" in standard discussions of equilibrium thermodynamics
is a device for producing these microstates.

Notice the following fact. 

Lemma 1.8. Let $p \in \Delta^A$ be given. Then

$\max_{q \in \Delta^A} \sum_{\omega \in A} q_\omega \left( \log(p_\omega) - \log(q_\omega) \right) = 0,$

and the maximum is achieved for $q = p$.

Proof: The function of $q \in \Delta^A$ which we seek to maximize is
concave, approaches $-\infty$ at the boundary and has a critical point at
$q = p$, which must then be the absolute maximum. □
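
Lemma 1.8 can be probed numerically. The function being maximized is the negative of what is usually called the Kullback-Leibler divergence of $q$ from $p$. The following sketch samples random distributions $q$ on a hypothetical three-element set and compares with the value at $q = p$. The names and the chosen $p$ are illustrative assumptions.

```python
import math
import random

def objective(q, p):
    """sum_omega q_omega * (log p_omega - log q_omega); zero q_omega contribute 0."""
    return sum(qi * (math.log(pi) - math.log(qi)) for qi, pi in zip(q, p) if qi > 0)

def random_distribution(n, rng):
    """A random point of the simplex with strictly positive coordinates."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]

rng = random.Random(0)      # fixed seed so the run is reproducible
p = [0.5, 0.3, 0.2]         # hypothetical example distribution
samples = [random_distribution(3, rng) for _ in range(1000)]
best = max(objective(q, p) for q in samples)
print("value at q = p:", objective(p, p))
print("largest sampled value:", best)
```

No sampled value exceeds the value 0 attained at $q = p$.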

Therefore,^3 the most probable microstates will be those $\zeta$ for which
each $q_\omega(\zeta) = p_\omega$. Such $\zeta$ are called equilibrium
microstates. In the case when the $p_\omega = n_\omega/N$ are rational,
their number is the multinomial coefficient, which we interpolate for
arbitrary $p \in \Delta^A$, using the Gamma function, by

(1.9)  Number of equilibrium microstates $= \frac{\Gamma(N + 1)}{\prod_{\omega \in A} \Gamma(N p_\omega + 1)}.$

Using the Stirling formula, we approximate the logarithm of (1.9) by

$N(\log(N) - 1) - \sum_{\omega \in A} N p_\omega \left( \log(N p_\omega) - 1 \right) = -N \sum_{\omega \in A} p_\omega \log(p_\omega).$



^3 This is, essentially, the law of large numbers of probability theory.
Up to now, consideration of microstates was formally identical with that
of independent trials of a random event such as a roll of dice.

Recall that for a probability distribution $p \in \Delta^A$ its entropy is
defined as

(1.10)  $S(p) = -\sum_{\omega \in A} p_\omega \log(p_\omega).$

The function $S$ is a concave function on the simplex $\Delta^A$, equal to
0 at the boundary, and achieving the maximum at the barycenter. Thus,
thermodynamically,

(1.11)  $S \approx \frac{\log(\text{Number of equilibrium microstates})}{N}, \quad N \gg |A|.$
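
The approximation (1.11) can be observed numerically by evaluating the Gamma-function expression (1.9) through the log-Gamma function. This is a minimal sketch: the distribution `p` and the values of `n_particles` are hypothetical choices for illustration.

```python
import math

def entropy(p):
    """S(p) = -sum p_omega log p_omega."""
    return -sum(x * math.log(x) for x in p if x > 0)

def log_equilibrium_count(p, n_particles):
    """log of Gamma(N + 1) / prod Gamma(N p_omega + 1), as in (1.9)."""
    return math.lgamma(n_particles + 1) - sum(
        math.lgamma(n_particles * x + 1) for x in p
    )

p = [0.5, 0.3, 0.2]  # hypothetical example distribution
for n_particles in (10, 1000, 100000):
    per_particle = log_equilibrium_count(p, n_particles) / n_particles
    print(n_particles, per_particle, entropy(p))
```

As the number of particles grows, the logarithm of the count divided by $N$ approaches the entropy of $p$.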

Now, the main thermodynamic principle used to deduce the Gibbs
distribution is that the number of equilibrium microstates should be as
large as possible, while maintaining the desired mean value of energy.
That is, take a point $E \in (E_{\min}, E_{\max})$ and look at all
probability distributions $p \in \Delta^A$ satisfying

(1.12)  $\langle E \rangle_p := \sum_{\omega \in A} p_\omega E(\omega) = E.$



The above principle implies (1.2) in virtue of the following fact.

Proposition 1.13. Among the distributions $p$ satisfying (1.12), the
maximal entropy is achieved by the Gibbs distribution $p(\beta)$, where
$\beta \in \mathbb{R}$ is the unique number such that
$\langle E \rangle(\beta)$, as defined by (1.4), is equal to $E$.

Proof: The constraint (1.12) defines a hyperplane section of the simplex
$\Delta^A$, a convex polytope; denote it $P$. The restriction $S|_P$ is a
strictly concave function, equal to 0 at the boundary. So it has a unique
critical point inside $P$, and this point is the global maximum. By the
Lagrange multiplier method, this critical point is characterized as a
point $p \in \Delta^A$ which satisfies the constraint (lies in $P$) and at
which the differential of $S$ is proportional to the differential of the
constraint. On $\Delta^A$ we have $\sum_\omega dp_\omega = 0$, therefore
$dS = -\sum_\omega \log(p_\omega) \, dp_\omega$, and the condition of
proportionality reads:

(1.14)  $-\sum_{\omega \in A} \log(p_\omega) \, dp_\omega = \lambda \sum_{\omega \in A} E(\omega) \, dp_\omega.$

But this condition is satisfied by $p_\omega = p_\omega(\beta)$ as defined
in the statement of the proposition, with $\lambda = \beta$. Indeed,
$\log p_\omega(\beta) = -\beta E(\omega) - \log Z(\beta)$, so in virtue of
$\sum_\omega dp_\omega = 0$, the LHS of (1.14) is equal to
$\beta \sum_\omega E(\omega) \, dp_\omega$. □
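
The maximum entropy property of Proposition 1.13 can be illustrated on a hypothetical three-state system. For three states the set of distributions with a given mean energy is a segment, so one can move along it in a single direction. The energies, the value of beta, and the helper names below are assumptions made for the sketch.

```python
import math

def gibbs(energies, beta):
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def mean_energy_of(p, energies):
    return sum(x * e for x, e in zip(p, energies))

energies = [0.0, 1.0, 3.0]          # hypothetical example values
p_gibbs = gibbs(energies, 0.8)

# Direction tangent to the constraint set: its components sum to zero and it
# is orthogonal to the energy vector, so the mean energy is unchanged.
direction = [2.0, -3.0, 1.0]

def perturbed(t):
    """Move away from the Gibbs distribution while keeping the mean energy."""
    return [x + t * v for x, v in zip(p_gibbs, direction)]

for t in (-0.05, -0.01, 0.0, 0.01, 0.05):
    q = perturbed(t)
    print(t, round(mean_energy_of(q, energies), 6), round(entropy(q), 6))
```

The mean energy column stays constant while the entropy is largest at t = 0, that is, at the Gibbs distribution.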
C. Entropy, energy and temperature. One can object that the above
derivation of the Gibbs distribution (1.2) is somewhat circular. It does
explain, from clear principles, the behavior of
$p = (p_\omega)_{\omega \in A}$ as a function of the mean energy $E$, but
not of $\beta$ or of temperature. In fact, $E$ and $\beta$ are supposed to
be related by the formula (1.4), which depends on (1.2). This is not
surprising since we have not used any meaningful features of the concept
of "temperature".

A mathematically satisfying way of dealing with this issue is to consider
the temperature as a secondary quantity, and to define it in terms of more
fundamental quantities such as energy. A standard definition like this
(see, e.g., [4]) says that the inverse temperature (i.e., $\beta$) is "the
derivative of the entropy with respect to the energy". Mathematically,
this definition (or, rather, its consistency) amounts to the following
general fact about exponential sums.

Proposition 1.15. Consider $\beta$ as a function of
$E \in (E_{\min}, E_{\max})$ by inverting the diffeomorphism of
Proposition 1.5. Let $S(E)$ be the entropy of the Gibbs distribution
$p(\beta(E))$. Then $dS/dE = \beta(E)$.

In particular, $S(E)$ is a concave function on $[E_{\min}, E_{\max}]$,
equal to 0 at both ends, with the derivative at these points being
$\pm\infty$.

Proof: This is an immediate consequence of the identification of $\beta$
with the Lagrange multiplier in (1.14). Indeed, for any constrained
maximum problem $\max_{g(x) = c} f(x)$ the value of the Lagrange
multiplier $\lambda = \lambda(c)$ has the interpretation as the
derivative, with respect to $c$, of the constrained maximum value (this
derivative is called, in the language of applied math, the "effective
price of the resource represented by the constraint", see, e.g., [6]). □
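
The identity $dS/dE = \beta$ of Proposition 1.15 can be tested with finite differences: vary $\beta$ slightly, record the changes in entropy and in mean energy, and take their ratio. The energies and step size below are hypothetical choices, and the function names are illustrative.

```python
import math

def gibbs(energies, beta):
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(energies, beta):
    return sum(p * e for p, e in zip(gibbs(energies, beta), energies))

def gibbs_entropy(energies, beta):
    return -sum(p * math.log(p) for p in gibbs(energies, beta))

def entropy_slope(energies, beta, h=1e-4):
    """Finite-difference estimate of dS/dE along the curve of Gibbs distributions."""
    ds = gibbs_entropy(energies, beta + h) - gibbs_entropy(energies, beta - h)
    de = mean_energy(energies, beta + h) - mean_energy(energies, beta - h)
    return ds / de

energies = [0.0, 1.0, 3.0]  # hypothetical example values
for beta in (-1.0, 0.5, 2.0):
    print(beta, entropy_slope(energies, beta))
```

In each row the estimated slope is close to the value of beta used.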

2 Thermodynamics with several Hamiltonians. 

A. The Gibbs distribution for several Hamiltonians. We now assume that
the set of states $A$ is equipped with not one, but several "energy
functionals" $E_1, \ldots, E_n : A \to \mathbb{R}$, which we combine into
one vector valued function $E : A \to \mathbb{R}^n$. To these energy
functionals there correspond $n$ "inverse temperatures"
$\beta_1, \ldots, \beta_n$, which we combine into one vector quantity
$\beta$ lying in the dual space $(\mathbb{R}^n)^*$.

For simplicity we assume that $E$ defines an embedding of $A$ into
$\mathbb{R}^n$. We can then think of $A$ as being a subset of
$\mathbb{R}^n$ to begin with, and sometimes drop $E$ from the notation,
thinking of it as just the inclusion map. With these conventions, we write
the partition function and the Gibbs distribution

(2.1)  $Z(\beta) = \sum_{\omega \in A} e^{-(\beta, \omega)}, \quad p_\omega(\beta) = \frac{e^{-(\beta, \omega)}}{Z(\beta)}.$

As in (1.3), the Gibbs distribution can be used to define the mean value
of any observable $O$ on $A$. In particular, taking for $O$ the vector
valued function (embedding) $E : A \to \mathbb{R}^n$, we have the mean
energy map

(2.2)  $\langle E \rangle : \beta \mapsto \langle E \rangle(\beta) = \sum_{\omega \in A} p_\omega(\beta) \cdot \omega = -\nabla_\beta \log Z(\beta).$

Here $\nabla_\beta$ means the gradient vector with respect to $\beta$,
i.e., the differential of a function considered as a vector in the dual
space. Thus $\langle E \rangle : (\mathbb{R}^n)^* \to \mathbb{R}^n$. Let
$Q \subset \mathbb{R}^n$ be the convex hull of $A$, and $Q^\circ$ be the
interior of $Q$. Since $(p_\omega(\beta))_{\omega \in A}$ is a probability
distribution on $A$ with all components nonzero, we see that
$\langle E \rangle$ maps $(\mathbb{R}^n)^*$ into $Q^\circ$. We can now
generalize the thermodynamic formalism of the previous section as follows.

Proposition 2.3. (a) The map
$\langle E \rangle : (\mathbb{R}^n)^* \to Q^\circ$ is a diffeomorphism.

(b) For any $E \in Q^\circ$ let $P_E$ be the set of probability
distributions $p \in \Delta^A$ satisfying the constraints

$\langle E \rangle_p := \sum_{\omega \in A} p_\omega \cdot \omega = E.$

Let $\beta(E) \in (\mathbb{R}^n)^*$ be the unique element such that
$\langle E \rangle(\beta) = E$. Then the Gibbs distribution
$p(\beta) = (p_\omega(\beta))$ defined above has maximal entropy among all
the distributions from $P_E$.

(c) Let $S(E)$ be the entropy of the distribution $p(\beta(E))$. Then
$\nabla_E S(E) = \beta(E)$.

(d) The functions $-\log Z(\beta)$ on $(\mathbb{R}^n)^*$ and $S(E)$ on
$Q^\circ \subset \mathbb{R}^n$ are concave and are the Legendre transforms
of each other.

Note that part (b) shows that the "vector" Gibbs distribution (2.1) has
the same thermodynamic significance as the more standard one (1.2).

The proof of the proposition will be given later in this section. 
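
Before the proof, the statements can be explored numerically in a small case. The sketch below takes for $A$ the four vertices of the unit square in the plane (a hypothetical example) and checks three things suggested by the text: the mean energy lies in the interior of the square, it agrees with a finite-difference gradient of $-\log Z$, and the entropy of the Gibbs distribution equals $(\beta, E) + \log Z(\beta)$, which follows from $\log p_\omega(\beta) = -(\beta, \omega) - \log Z(\beta)$. All names are illustrative.

```python
import math

def dot(beta, omega):
    return sum(b * w for b, w in zip(beta, omega))

def log_z(points, beta):
    return math.log(sum(math.exp(-dot(beta, w)) for w in points))

def gibbs(points, beta):
    weights = [math.exp(-dot(beta, w)) for w in points]
    z = sum(weights)
    return [x / z for x in weights]

def mean_energy(points, beta):
    """Vector-valued mean energy: sum of p_omega(beta) * omega."""
    p = gibbs(points, beta)
    return [sum(pi * w[k] for pi, w in zip(p, points)) for k in range(len(beta))]

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def minus_grad_log_z(points, beta, h=1e-5):
    """Central finite-difference approximation of -grad log Z."""
    grad = []
    for k in range(len(beta)):
        up = [b + (h if i == k else 0.0) for i, b in enumerate(beta)]
        down = [b - (h if i == k else 0.0) for i, b in enumerate(beta)]
        grad.append(-(log_z(points, up) - log_z(points, down)) / (2 * h))
    return grad

points = [(0, 0), (1, 0), (0, 1), (1, 1)]  # hypothetical A: vertices of a square
beta = (0.3, -1.2)
e = mean_energy(points, beta)
print("mean energy:", e)
print("-grad log Z:", minus_grad_log_z(points, beta))
print("S and (beta, E) + log Z:", entropy(gibbs(points, beta)), dot(beta, e) + log_z(points, beta))
```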
B. Example: toric varieties and the moment map. Assume that $A$ lies in
$\mathbb{Z}^n \subset \mathbb{R}^n$. The exponential
$e^{-(\beta, \omega)}$, $\omega \in A$, then becomes a Laurent monomial
$z^\omega = \prod_i z_i^{\omega_i}$ in the variables $z_i = e^{-\beta_i}$.
Real values of $\beta$ correspond to $z \in \mathbb{R}_+^n$, where
$\mathbb{R}_+$ is the set of positive real numbers.

The monomial $z^\omega$ makes sense for any $z \in (\mathbb{C}^*)^n$.
Consider the complex vector space $\mathbb{C}^A$ with basis $e_\omega$,
$\omega \in A$, and let $\mathbb{P}^A$ be its projectivization. A vector
of $\mathbb{C}^A$ is thus a tuple $(a_\omega)_{\omega \in A}$. The torus
$(\mathbb{C}^*)^n$ acts on $\mathbb{C}^A$ and $\mathbb{P}^A$ by

$z \cdot e_\omega = z^\omega e_\omega.$

In particular, we consider the orbit $X_A^0 \subset \mathbb{P}^A$ of the
point represented by $\mathbf{1} = (1)_{\omega \in A} \in \mathbb{C}^A$
and let $X_A \subset \mathbb{P}^A$ be the projective toric variety defined
as the closure of $X_A^0$.

Assume for simplicity that $A$ generates $\mathbb{Z}^n$ as an affine
lattice, i.e., there is no smaller integer affine sublattice in
$\mathbb{Z}^n$ containing $A$. Then the action of $(\mathbb{C}^*)^n$ on
$\mathbb{P}^A$ is faithful, in particular, the action map

$z \mapsto z \cdot \mathbf{1} = (z^\omega)_{\omega \in A}$

identifies $(\mathbb{C}^*)^n$ with $X_A^0$. Let $X_A^+ \subset X_A^0$ be
the image of $\mathbb{R}_+^n \subset (\mathbb{C}^*)^n$. This image is known
as the positive part of the toric variety $X_A$. Clearly, $X_A^+$ consists
of the points of the form
$x(\beta) = (e^{-(\beta, \omega)})_{\omega \in A}$ for all
$\beta \in (\mathbb{R}^n)^*$.

The action of the compact part $(S^1)^n$ of the torus $(\mathbb{C}^*)^n$
on the projective space $\mathbb{P}^A$ preserves the standard Fubini-Study
Kähler metric and gives rise to the moment map

(2.4)  $\mu_{\mathbb{P}} : \mathbb{P}^A \to \mathbb{R}^n, \quad (a_\omega) \mapsto \frac{\sum_{\omega \in A} |a_\omega|^2 \cdot \omega}{\sum_{\omega \in A} |a_\omega|^2},$

see, e.g., [2]. The image of this map is the polytope
$Q = \mathrm{Conv}(A)$. Let $\mu_+$ be the restriction of
$\mu_{\mathbb{P}}$ to $X_A^+$. Using the above parametrization of $X_A^+$
by the $\beta$, we write $\mu_+$ as a map from $(\mathbb{R}^n)^*$ to $Q$,
and find that

(2.5)  $\mu_+(\beta) = \frac{\sum_{\omega \in A} \omega \cdot e^{-(2\beta, \omega)}}{\sum_{\omega \in A} e^{-(2\beta, \omega)}} = \langle E \rangle(2\beta)$

is nothing but the mean energy of the twice scaled $\beta$, with respect to
the Gibbs distribution (2.1). Proposition 2.3(a) then reduces to the
well-known fact about toric varieties: that the moment map defines a
diffeomorphism from the positive part to the interior of the defining
polytope, see [1], [2].
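
The identification of the moment map on the positive part with the mean energy at $2\beta$ can be checked numerically. The sketch assumes the standard form of the moment map, $\sum_\omega |a_\omega|^2 \omega / \sum_\omega |a_\omega|^2$, and uses the hypothetical set $A = \{(0,0), (1,0), (0,1)\}$, for which the toric variety is the projective plane. The function names are illustrative.

```python
import cmath
import math

def dot(beta, omega):
    return sum(b * w for b, w in zip(beta, omega))

def moment_map(points, a):
    """Assumed standard form: sum |a_w|^2 * w / sum |a_w|^2."""
    norms = [abs(x) ** 2 for x in a]
    total = sum(norms)
    dim = len(points[0])
    return [sum(n * w[k] for n, w in zip(norms, points)) / total for k in range(dim)]

def mean_energy(points, beta):
    weights = [math.exp(-dot(beta, w)) for w in points]
    z = sum(weights)
    return [sum(x * w[k] for x, w in zip(weights, points)) / z for k in range(len(beta))]

def positive_point(points, beta):
    """x(beta) = (exp(-(beta, w)))_w, a point of the positive part."""
    return [math.exp(-dot(beta, w)) for w in points]

points = [(0, 0), (1, 0), (0, 1)]   # hypothetical example set A
beta = (0.4, -0.7)
x = positive_point(points, beta)
print(moment_map(points, x), mean_energy(points, [2 * b for b in beta]))

# Acting by an element of the compact torus changes only the phases of the
# coordinates, so the value of the moment map is unchanged.
theta = (0.9, 2.3)
rotated = [cmath.exp(1j * dot(theta, w)) * xi for w, xi in zip(points, x)]
print(moment_map(points, rotated))
```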
C. Direct and inverse images of concave functions. To give a natural
proof of Proposition 2.3 we start with some general remarks. By a convex
body we will mean a convex subset $P$ of some finite-dimensional affine
space $V$ over $\mathbb{R}$. For such $P$ we denote by $\mathrm{Conc}(P)$
the space (semigroup) of concave functions $f : P \to \mathbb{R}$ which
are proper, i.e., such that each level set $f^{-1}(c)$ is compact. Any
such function achieves a maximum on $P$.

By an admissible embedding of convex bodies $i : P' \to P$ we mean
an injective map induced by an affine embedding of ambient affine spaces
$V' \to V$, so that $P' = V' \cap P$. In this case for any
$f \in \mathrm{Conc}(P)$ we have the inverse image (restriction)
$i^* f = f|_{P'}$ which again lies in $\mathrm{Conc}(P')$.

Similarly, by an admissible surjection of convex bodies $j : P \to P''$ we
mean a surjective map induced by an affine surjection $J : V \to V''$ of
ambient affine spaces. In this case for any $f \in \mathrm{Conc}(P)$ we
have the direct image, which is the function $j_* f$ on $P''$ defined by

(2.6)  $(j_* f)(p'') = \max_{j(p) = p''} f(p).$

Example 2.7. One can take $P = V$ to be a finite-dimensional vector space
over $\mathbb{R}$ and $f$ to be a negative definite quadratic form on $V$.
Then, for any linear surjection $j : V \to V''$, the direct image $j_* f$
is a negative definite quadratic form on $V''$. The integration, along the
fibers of $j$, of the Gaussian function $e^{f(v)}$ on $V$ gives, up to a
constant, the Gaussian function $e^{(j_* f)(v'')}$ on $V''$.

For a general $f \in \mathrm{Conc}(P)$ and an admissible surjection
$j : P \to P''$ the function $e^{(j_* f)(p'')/h}$ is the leading term, as
$h \to 0$, of the function on $P''$ obtained by integrating $e^{f(p)/h}$
along the fibers of $j$.
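
Example 2.7 can be made concrete in the plane. The sketch below takes a hypothetical negative definite quadratic form $f(x, y) = -\frac{1}{2}(a x^2 + 2 b x y + c y^2)$ and the projection $j(x, y) = x$. Maximizing over $y$ gives $-\frac{1}{2}(a - b^2/c) x^2$, and integrating $e^{f}$ over $y$ gives $e^{(j_* f)(x)}$ times the constant $\sqrt{2\pi/c}$. The coefficients, grid, and function names are assumptions made for the illustration.

```python
import math

A_COEF, B_COEF, C_COEF = 2.0, 1.0, 3.0  # hypothetical positive definite coefficients

def f(x, y):
    """Negative definite quadratic form on the plane."""
    return -0.5 * (A_COEF * x * x + 2.0 * B_COEF * x * y + C_COEF * y * y)

def direct_image_closed_form(x):
    """Maximum of f(x, y) over y, attained at y = -b x / c."""
    return -0.5 * (A_COEF - B_COEF ** 2 / C_COEF) * x * x

def grid(lo, hi, steps):
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]

def direct_image_numeric(x, ys):
    """Approximate the maximum over the fiber by a grid search."""
    return max(f(x, y) for y in ys)

def fiber_integral(x, ys):
    """Riemann sum of exp(f(x, y)) over the fiber."""
    step = ys[1] - ys[0]
    return step * sum(math.exp(f(x, y)) for y in ys)

ys = grid(-10.0, 10.0, 20000)
for x in (0.5, 1.5):
    ratio = fiber_integral(x, ys) / math.exp(direct_image_closed_form(x))
    print(x, direct_image_numeric(x, ys), direct_image_closed_form(x), ratio)
```

The ratio in the last column is the same for both values of $x$, which is the "up to a constant" statement of the example.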

The following is then elementary. 

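Proposition 2.8. (a) The function $j_* f$ belongs to $\mathrm{Conc}(P'')$.
(b) (Base change) Let

$\begin{array}{ccc} P_2 & \xrightarrow{\ i\ } & P_1 \\ {\scriptstyle j_2}\downarrow & & \downarrow{\scriptstyle j_1} \\ P_2'' & \xrightarrow{\ i'\ } & P_1'' \end{array}$

be a Cartesian square of convex bodies, such that $i, i'$ are admissible
embeddings and $j_1, j_2$ are admissible surjections. Then for any
$f \in \mathrm{Conc}(P_1)$ we have the equality
$(j_2)_* i^* f = (i')^* (j_1)_* f$ of concave functions on $P_2''$. □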
D. Proof of Proposition 2.3. Consider the admissible surjection of convex
bodies

(2.9)  $\pi : \Delta^A \to Q, \quad p \mapsto \langle E \rangle_p = \sum_{\omega \in A} p_\omega \cdot \omega.$

Fix $E \in Q^\circ$. The fiber $\pi^{-1}(E)$ is the set $P_E$ of part (b)
of the proposition. Consider the entropy function
$S \in \mathrm{Conc}(\Delta^A)$ defined by (1.10). It is strictly concave,
so the restriction of $S$ to $\pi^{-1}(E)$ achieves maximum at a unique
interior point; denote this point $\bar{p}(E)$. This point is,
furthermore, the unique critical point of $S$ on $\pi^{-1}(E)$. Consider
also the direct image function $\pi_* S \in \mathrm{Conc}(Q)$.

The location of the critical point can be found by the Lagrange multiplier
method for $n$ constraints: the differential of $S$ at $\bar{p}(E)$ should
be a linear combination of the differentials of the individual scalar
constraints, i.e., it should have the form $(\lambda, d\pi)$ for some
$\lambda \in (\mathbb{R}^n)^*$. Further, we have the $n$-constraint
interpretation of the Lagrange multipliers as the partial derivatives of
the maximal value with respect to the constraints, see again [6]. This
means that $\lambda = \lambda(E)$ is equal to the gradient (differential)
of the strictly concave function $\pi_* S$ at the point $E$.

Next, look at the Gibbs distribution $p(\lambda(E))$. We see that
$p(\lambda(E))$ is a critical point of $S$ on $\pi^{-1}(E)$, so it is
equal to $\bar{p}(E)$. This implies that $\lambda(E) = \beta(E)$ is the
inverse to the map $\beta \mapsto \langle E \rangle(\beta)$, which is
therefore a diffeomorphism, thus proving part (a) of the proposition. In
particular, the function $S(E)$ of part (c) is the same as $\pi_* S$.
Since $p(\lambda(E)) = \bar{p}(E)$, this implies part (b). Part (c)
follows since $\lambda(E) = \nabla_E(\pi_* S)$ by definition. Finally, the
Legendre transform relation between the functions $-\log Z(\beta)$ and
$S(E) = \pi_* S$ in part (d) is equivalent to the fact that their
gradients define mutually inverse diffeomorphisms, as we have shown that
$\langle E \rangle = \nabla_\beta(-\log Z)$ is inverse to
$\nabla_E(\pi_* S)$. □